*Corresponding Author:
M. Wyatt Toure
e-mail: m_wyatt.toure@mail.mcgill.ca
Date of last update: May 26 2021
This page reports the analyses for the experiment described in ‘Colour biases in learned foraging preferences in Trinidadian guppies’. The raw data used to conduct these analyses are available in the colour-learning-project-data.csv file. Descriptions of the variables found in the dataset are given in the metadata section of the README file. The R script to reproduce the analysis and this site are in the analysis.Rmd file. The methods used to produce this data can be found on the Methods page.
We analysed the data from our experiment using linear mixed effects and
generalized linear mixed effects models with the lmer() and glmer() functions
from the lme4 package. P-values and effective degrees of freedom were obtained
using the lmerTest package. Model residuals were checked against their
distributional assumptions with the DHARMa package; you can click the ‘See
Model Residuals’ button below the model formulas to see the residual diagnostic
plots produced by DHARMa for that particular model.
This first model contains the data for all individual guppies. I looked at the
green object preference of all guppies in an intercept-only model to see whether
the green object preference at baseline was significantly different from zero.
green.object.preference is the time (seconds) spent near the green object minus
the time spent near the blue object.
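As an illustration, the preference score is just a difference of the two times. A toy computation with made-up values (the actual column names in the raw data file may differ):

```r
# Hypothetical raw observation times (seconds) for three fish
time.near.green <- c(120, 80, 95)
time.near.blue <- c(100, 110, 95)

# Positive scores mean more time near the green object, negative more near blue
green.object.preference <- time.near.green - time.near.blue
green.object.preference # 20 -30 0
```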
baseline_data_model <-
  lm(green.object.preference ~ 1,
    data = baseline_data
  )
| Factor | Estimate | Std. Error | T statistic | P value |
|---|---|---|---|---|
| Intercept | 1.065 | 3.008 | 0.354 | 0.727 |
During the initial test, there was no significant preference for the green object across all guppies (p = 0.727).
Figure 1: Preference for the green object relative to the blue object across all guppies at baseline. Negative values represent more time spent with the blue object, positive values indicate more time spent with the green object. Data are means ± 95% CI.
To see whether fish were responsive during training, our second model asks whether the preference for the rewarding object changes throughout training and whether the change in rewarding object preference differs between the treatments.
- rewarding.object.preference is the time (seconds) spent near the rewarding object minus the time spent near the unrewarding object
- rewarding.object.colour is the identity of the rewarding object (blue or green)
- trial is the number of the training trial. In this model it is supplied as an integer
- id is the identity of the individual fish

training_data_model <-
  lmer(rewarding.object.preference ~ rewarding.object.colour * trial + (1 | id),
    data = training_data
  )
There is a slight deviation in the upper quantile but no indication in the residual plot of a gross model misfit.
| Factor | Estimate | Std. Error | T statistic | df | P value |
|---|---|---|---|---|---|
| Intercept | -11.017 | 19.490 | -0.565 | 48.463 | 0.575 |
| Rewarding object colour | 55.989 | 27.341 | 2.048 | 47.159 | 0.046 |
| Trial | 7.700 | 1.085 | 7.098 | 438.052 | < .001 |
| Rewarding object colour X Trial | -0.549 | 1.490 | -0.368 | 434.792 | 0.713 |
There was a significant effect of trial: over the 20 training trials, guppies in the two treatments increased their relative preference for their respective rewarded objects by 7.7 seconds each trial (Figure 2, p < .001). There was also a significant effect of rewarded-object colour (p = 0.046): during training, green-rewarded guppies expressed a stronger preference for their rewarded object (the green object) than blue-rewarded guppies did for the blue object. However, there was no significant interaction between rewarding object colour and trial (p = 0.713), i.e., the rate of increase in object preference over trials did not significantly differ between the treatments.
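As a rough back-of-the-envelope check, the fixed effects imply the following expected preferences at the last training trial (coefficients taken from the table above, with blue-rewarded assumed to be the reference level; random effects are ignored):

```r
# Fixed-effect estimates from the training model
intercept <- -11.017 # baseline preference, blue-rewarded fish
colour_effect <- 55.989 # shift for green-rewarded fish
trial_slope <- 7.700 # per-trial increase for blue-rewarded fish
interaction <- -0.549 # slope difference for green-rewarded fish

# Expected rewarding-object preference (seconds) at trial 20
blue_at_20 <- intercept + trial_slope * 20
green_at_20 <- intercept + colour_effect + (trial_slope + interaction) * 20

round(blue_at_20, 1) # 143.0
round(green_at_20, 1) # 188.0
```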
Figure 2: Relative preference for the green object in both treatments during training trials (trials 1-20). Negative values represent more time spent with the blue object, positive values indicate more time spent with the green object. Light lines connect individuals across trials and bold lines represent a linear fit with 95% CI (grey shading). Subjects were consistently rewarded for approaching the blue object (dashed lines) or the green object (solid lines).
For the main effects of training and rewarding object colour on rewarding object
preference, I fit a generalized linear mixed effects model with a Gaussian
distribution that modeled the variances to account for variance heterogeneity,
using the package glmmTMB. My third model asks whether the preference for the
rewarding object changed between the baseline and final test and looks for an
interaction with rewarded object colour.
- rewarding.object.preference is the time (seconds) spent near the rewarding object minus the time spent near the unrewarding object
- rewarding.object.colour is the identity of the rewarding object (blue or green)
- trial is the number of the training trial. In this model it is supplied as a factor where 0 is the baseline trial and 21 is the final trial
- id is the identity of the individual fish

test_data_model_glm <-
  glmmTMB(rewarding.object.preference ~
    trial * rewarding.object.colour + (1 | id) +
    diag(0 + rewarding.object.colour:trial | id),
  data = test_data,
  family = gaussian
  )
| Factor | Estimate | Std. Error | T statistic | P value |
|---|---|---|---|---|
| Intercept | 1.842 | 3.011 | 0.612 | 0.541 |
| Trial | 17.697 | 10.017 | 1.767 | 0.077 |
| Rewarding object colour | 1.645 | 5.542 | 0.297 | 0.767 |
| Rewarding object colour X Trial | 65.351 | 20.832 | 3.137 | 0.002 |
When comparing the initial and final preference tests, both conducted without food rewards present, we found a significant interaction between test and rewarding object colour (p = 0.002). Green-rewarded guppies showed a shift in rewarded object preference that was on average 65.4 seconds larger than the shift shown by guppies trained to blue (Figure 3). These results were unaffected by the removal of one fish that did not feed during training (see ESM Model 2).
Figure 3: Changes in object preference from an initial test before training to a final test after training. During training, fish were rewarded for approaching the blue object (blue squares and lines) or the green object (green squares and lines). At test, no food reward was present. Dashed line represents an equal preference for either object. Data are means ± 95% CI; lighter points and lines are data for each individual. The final preference of green-rewarded fish is stronger than that of fish that had been rewarded for approaching the blue object.
To determine whether the means of the final rewarding object preference for the
two treatments were different, I conducted post-hoc comparisons with the package
emmeans. I compared the following means:
test_data_model_emmeans <- emmeans(test_data_model_glm,
  specs = ~ trial:rewarding.object.colour
)

# Making vectors to represent means of interest from emmeans() output
blue0 <- c(1, 0, 0, 0)
blue21 <- c(0, 1, 0, 0)
green0 <- c(0, 0, 1, 0)
green21 <- c(0, 0, 0, 1)

# Set seed to prevent confidence intervals from changing when code is re-run
set.seed(123)

custom_contrasts <- contrast(test_data_model_emmeans,
  method = list(
    "21 blue - 0 blue" = blue21 - blue0,
    "21 green - 0 green" = green21 - green0,
    "21 green - 21 blue" = green21 - blue21,
    "0 green - 0 blue" = green0 - blue0
  ), adjust = "mvt"
) %>%
  summary(infer = TRUE)
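Each named contrast is simply a difference of indicator vectors over the four estimated marginal means, which are assumed here to come out of emmeans() in the order blue-0, blue-21, green-0, green-21. For example:

```r
blue0 <- c(1, 0, 0, 0)
blue21 <- c(0, 1, 0, 0)
green0 <- c(0, 0, 1, 0)
green21 <- c(0, 0, 0, 1)

# "21 blue - 0 blue": +1 on the blue final-test mean, -1 on the blue baseline mean
blue21 - blue0 # -1 1 0 0

# "21 green - 21 blue": compares the two treatments at the final test
green21 - blue21 # 0 -1 0 1
```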
| Contrast | Estimate | Lower CL | Upper CL | df | P Value |
|---|---|---|---|---|---|
| 21 blue - 0 blue | 17.697 | -7.968 | 43.363 | 34 | 0.251 |
| 21 green - 0 green | 83.048 | 36.247 | 129.849 | 34 | < .001 |
| 21 green - 21 blue | 66.996 | 15.542 | 118.449 | 34 | 0.007 |
| 0 green - 0 blue | 1.645 | -12.555 | 15.845 | 34 | 0.988 |
Post-hoc comparisons (Table 2) reveal that before training there was no significant difference between the treatments in the strength of preference for the rewarded object (p = 0.988). The shift in rewarded object preference between the initial and final preference tests was significant for green-trained guppies but not for blue-trained guppies: green-trained guppies increased their preference for the green object by 83 seconds from the initial to the final test (p < .001), whereas blue-trained guppies increased their preference for the blue object by 17.7 seconds, an effect that was not statistically significant (p = 0.251). At the final test, green-rewarded guppies had a significantly stronger preference for the previously rewarded object than blue-rewarded guppies (p = 0.007).
A discrepancy in reinforcement between treatments may influence performance on a final preference test. To see whether there was a difference in feeding between treatments, I counted the number of trials in which an individual fish ate across all of training and compared the feeding counts between treatments. To do this I fit a generalized linear model with a negative binomial distribution. The response variable ‘feeding count’ is the sum of the number of trials in which a guppy ate.
- feeding.count is the number of trials in which an individual fish ate
- rewarding.object.colour is the identity of the rewarding object (blue or green)

feeding_data_model <-
  glm.nb(feeding.count ~ rewarding.object.colour,
    data = my_feeding_data
  )
| Factor | Estimate | Std. Error | T statistic | P value |
|---|---|---|---|---|
| Intercept | 2.407 | 0.195 | 12.351 | < .001 |
| Rewarding object colour | 0.071 | 0.263 | 0.270 | 0.787 |
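Since glm.nb uses a log link by default, these estimates translate into expected feeding counts by exponentiation (a quick interpretation aid using the values in the table above; blue-rewarded is assumed to be the reference level):

```r
intercept <- 2.407 # log expected feeding count for the reference group
colour_effect <- 0.071 # log-scale difference for green-rewarded fish

exp(intercept) # ~11.1 trials with feeding for blue-rewarded fish
exp(intercept + colour_effect) # ~11.9 trials for green-rewarded fish
```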
We found no significant difference between green-rewarded and blue-rewarded fish in the number of trials in which individuals fed (Figure 4, p = 0.787). Removing the one individual that did not feed during training gave a similar result (p = 0.87). We also incorporated feeding count as a covariate in Model 3 and found the same pattern of results (see ESM Model 2).
Figure 4: Average number of trials in which a fish fed during training. Data are means ± 95% confidence intervals with probability density functions of the data (showing the negative binomial distribution) to the right of the raw data.
In this section we describe models not included in the main text.
The models described in this section were run to address the potential role of feeding count on test performance. The concern was that the results might be explained by differential levels of reinforcement between the groups or between individuals during training. To ensure our results were robust to variation in feeding count we: (1) re-ran the final test model with the individual that never fed removed and (2) re-ran the final test model with feeding count included as a covariate.
One fish did not eat during any trial; however, removing this individual does not change the conclusions.
test_feeding_data_low_feeders_removed <-
  test_feeding_data %>% filter(feeding.count > 0)

test_feeding_data_low_feeders_removed_model <-
  glmmTMB(rewarding.object.preference ~
    trial * rewarding.object.colour + (1 | id) +
    diag(0 + rewarding.object.colour:trial | id),
  data = test_feeding_data_low_feeders_removed,
  family = gaussian
  )
| Factor | Estimate | Std. Error | T statistic | P value |
|---|---|---|---|---|
| Intercept | 0.445 | 3.004 | 0.148 | 0.882 |
| Trial | 24.090 | 9.694 | 2.485 | 0.013 |
| Rewarding object colour | 3.042 | 5.538 | 0.549 | 0.583 |
| Rewarding object colour X Trial | 58.958 | 20.679 | 2.851 | 0.004 |
With this next model we looked to see whether including the number of trials in which an individual fed as a covariate changes the conclusions.
- rewarding.object.preference is the time (seconds) spent near the rewarding object minus the time spent near the unrewarding object
- rewarding.object.colour is the identity of the rewarding object (blue or green)
- trial is the number of the training trial. In this model it is supplied as a factor where 0 is the baseline trial and 21 is the final trial
- id is the identity of the individual fish
- feeding.count is the number of trials in which an individual fish ate

test_data_feeding_controlled_model <-
  glmmTMB(rewarding.object.preference ~
    trial * rewarding.object.colour + feeding.count + (1 | id) +
    diag(0 + rewarding.object.colour:trial | id),
  data = test_feeding_data,
  family = gaussian
  )
| Factor | Estimate | Std. Error | T statistic | P value |
|---|---|---|---|---|
| Intercept | -3.642 | 7.215 | -0.505 | 0.614 |
| Trial | 17.697 | 9.553 | 1.852 | 0.064 |
| Rewarding object colour | 1.241 | 5.514 | 0.225 | 0.822 |
| Feeding count | 0.494 | 0.579 | 0.854 | 0.393 |
| Rewarding object colour X Trial | 65.352 | 19.847 | 3.293 | < .001 |
The main results do not change when I control for feeding count. The table above is the output of the feeding-count-controlled model; below is the output table for the non-feeding-count-controlled model (Model 3).
| Factor | Estimate | Std. Error | T statistic | P value |
|---|---|---|---|---|
| Intercept | 1.842 | 3.011 | 0.612 | 0.541 |
| Trial | 17.697 | 10.017 | 1.767 | 0.077 |
| Rewarding object colour | 1.645 | 5.542 | 0.297 | 0.767 |
| Rewarding object colour X Trial | 65.351 | 20.832 | 3.137 | 0.002 |
In both models the p-values are similar. While the effect of feeding count is not significant (p = 0.393), it trends in the expected direction in the feeding-count-controlled model: as feeding count increases, the preference for the rewarding object colour also increases.
A reviewer of our manuscript asks:

> Why the authors did not (also) exploit a percentage preference, which is commonly used?
To determine whether our results are robust to the choice of measure, we also ran an analysis with the percentage preference as ESM Model 3. The results from the main experiment remain unchanged when we do this.
Since our data are continuous proportions, we used a generalized linear mixed effects model with a beta distribution.
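The proportional preference is presumably the fraction of time spent near the rewarding object out of the total time spent near either object (the exact construction is in the analysis script; the column names below are made up for illustration):

```r
# Hypothetical times (seconds) near each object for three fish
time.near.rewarding <- c(150, 60, 90)
time.near.unrewarding <- c(50, 60, 110)

# Continuous proportion in (0, 1); 0.5 means no preference
rewarding.object.preference.prop <-
  time.near.rewarding / (time.near.rewarding + time.near.unrewarding)
rewarding.object.preference.prop # 0.75 0.50 0.45
```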
prop_test_data_model_glm <-
  glmmTMB(rewarding.object.preference.prop ~
    trial * rewarding.object.colour + (1 | id),
  data = test_data,
  family = beta_family(link = "logit")
  )

simulationOutput <- simulateResiduals(
  fittedModel = prop_test_data_model_glm,
  n = 1000
)
plot(simulationOutput)
| Factor | Estimate | Std. Error | T statistic | P value |
|---|---|---|---|---|
| Intercept | 0.096 | 0.191 | 0.504 | 0.615 |
| Trial | 0.162 | 0.271 | 0.599 | 0.549 |
| Rewarding object colour | -0.070 | 0.259 | -0.269 | 0.788 |
| Rewarding object colour X Trial | 1.025 | 0.379 | 2.702 | 0.007 |
Figure 5: Changes in proportional object preference from an initial test before training to a final test after training. During training, fish were rewarded for approaching the blue object (blue squares and lines) or the green object (green squares and lines). At test, no food reward was present. Dashed line represents an equal preference for either object. Data are means ± 95% CI; lighter points and lines are data for each individual. The final preference of green-rewarded fish is stronger than that of fish that had been rewarded for approaching the blue object.
prop_test_data_model_emmeans <- emmeans(prop_test_data_model_glm,
  specs = ~ trial:rewarding.object.colour, type = "response"
)

# Set seed to prevent confidence intervals from changing when code is re-run
set.seed(123)

prop_custom_contrasts <- contrast(prop_test_data_model_emmeans,
  method = list(
    "21 blue - 0 blue" = blue21 - blue0,
    "21 green - 0 green" = green21 - green0,
    "21 green - 21 blue" = green21 - blue21,
    "0 green - 0 blue" = green0 - blue0
  ), adjust = "mvt"
) %>%
  summary(infer = TRUE)
| Contrast | Odds ratio | Lower CL | Upper CL | df | P Value |
|---|---|---|---|---|---|
| 21 blue / 0 blue | 1.176 | 0.588 | 2.355 | 38 | 0.913 |
| 21 green / 0 green | 3.278 | 1.659 | 6.478 | 38 | < .001 |
| 21 green / 21 blue | 2.600 | 1.277 | 5.291 | 38 | 0.005 |
| 0 green / 0 blue | 0.933 | 0.481 | 1.809 | 38 | 0.991 |
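Because the beta model uses a logit link, emmeans reports these contrasts as odds ratios: exponentiated differences on the log-odds scale. A toy back-transformation (the numbers here are made up, not model output):

```r
# A shift in proportional preference from 0.50 at baseline to 0.75 at final test
baseline_logodds <- qlogis(0.50) # log odds of 0.50 = 0
final_logodds <- qlogis(0.75) # log odds of 0.75 = log(3)

# The corresponding odds ratio: the odds of being near the rewarding object triple
exp(final_logodds - baseline_logodds) # 3
```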
This is a model I (Wyatt) ran just because I was curious.
This model asks whether the time spent near the rewarding object during a training session is influenced by whether a fish ate or not. Here trial is a variable containing the training trials (1-20). It is supplied as a random effect along with individual ID in the model.
- rewarding.object.preference is the time (seconds) spent near the rewarding object minus the time spent near the unrewarding object
- ate is a binary factor (Yes or No) corresponding to whether the fish ate in a given trial
- trial is the number of the training trial. In this model it is supplied as a random effect
- id is the identity of the individual fish

training_data_model_rewarding_object <-
  lmer(rewarding.object.preference ~ ate + (1 | id) + (1 | trial),
    data = training_data
  )
There is a significant deviation from uniformity as indicated by the significant Kolmogorov-Smirnov test. However, this model has a particularly large sample size (n = 456), so even slight deviations will be significant. The effect size of the deviation (D = 0.069) shows that it is minor (D < 0.1), and visual inspection does not suggest large deviations in the residuals, so our model is still appropriate.
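For context on the scale of D, a Kolmogorov-Smirnov test on a genuinely uniform sample of the same size typically gives a statistic of similarly small magnitude (a toy simulation, not the actual DHARMa residual check):

```r
set.seed(42)
# Simulate "residuals" from the uniform distribution DHARMa compares against
simulated_resid <- runif(456)

# D is the maximum distance between the empirical and uniform CDFs
ks <- ks.test(simulated_resid, "punif")
unname(ks$statistic) # a small D is expected for a truly uniform sample
```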
| Factor | Estimate | Std. Error | T statistic | df | P value |
|---|---|---|---|---|---|
| Intercept | 42.532 | 11.918 | 3.569 | 36.95 | 0.001 |
| Ate | 91.185 | 10.953 | 8.325 | 223.75 | < .001 |
Throughout training, fish that ate during a session spent on average 91.2 more seconds near the rewarded object than fish that did not eat (Figure 6, p < .001). Fish that spent more trials eating may therefore have received more reinforcement for the object-food association. This result is what led me to investigate whether there was a difference in food reinforcement between the two treatments in Model 4.
Figure 6: Preference for the rewarding object during training based on whether an individual ate during a trial or not. Dashed line represents the no-preference value. In panel A, data are means ± 95% CI. In panel B, the data consist of all the feeding choices of all individuals across all training trials; data are means ± 95% CI with probability density functions to the right of the raw data.
A complete list of the tools used is provided below.
| Package | Version | Reference |
|---|---|---|
| broom | 0.5.5 | David Robinson and Alex Hayes (2020). broom: Convert Statistical Analysis Objects into Tidy Tibbles. R package version 0.5.5. https://CRAN.R-project.org/package=broom |
| broom.mixed | 0.2.6 | Ben Bolker and David Robinson (2020). broom.mixed: Tidying Methods for Mixed Models. R package version 0.2.6. https://CRAN.R-project.org/package=broom.mixed |
| carData | 3.0.3 | John Fox, Sanford Weisberg and Brad Price (2019). carData: Companion to Applied Regression Data Sets. R package version 3.0-3. https://CRAN.R-project.org/package=carData |
| cowplot | 1.0.0 | Claus O. Wilke (2019). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 1.0.0. https://CRAN.R-project.org/package=cowplot |
| DHARMa | 0.3.3.0 | Florian Hartig (2020). DHARMa: Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models. R package version 0.3.3.0. http://florianhartig.github.io/DHARMa/ |
| dplyr | 1.0.3 | Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2021). dplyr: A Grammar of Data Manipulation. R package version 1.0.3. https://CRAN.R-project.org/package=dplyr |
| effects | 4.1.4 | John Fox and Sanford Weisberg (2019). An R Companion to Applied Regression, 3rd Edition. Thousand Oaks, CA http://tinyurl.com/carbook |
| emmeans | 1.5.1 | Russell Lenth (2020). emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 1.5.1. https://CRAN.R-project.org/package=emmeans |
| ggplot2 | 3.3.3 | H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016. |
| ggpubr | 0.2.5 | Alboukadel Kassambara (2020). ggpubr: ‘ggplot2’ Based Publication Ready Plots. R package version 0.2.5. https://CRAN.R-project.org/package=ggpubr |
| glmmTMB | 1.0.0 | Mollie E. Brooks, Kasper Kristensen, Koen J. van Benthem, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler and Benjamin M. Bolker (2017). glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. The R Journal, 9(2), 378-400. |
| knitr | 1.30 | Yihui Xie (2020). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.30. |
| lme4 | 1.1.21 | Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01. |
| lmerTest | 3.1.1 | Kuznetsova A, Brockhoff PB, Christensen RHB (2017). “lmerTest Package: Tests in Linear Mixed Effects Models.” Journal of Statistical Software, 82(13), 1-26. doi: 10.18637/jss.v082.i13 (URL: https://doi.org/10.18637/jss.v082.i13). |
| magrittr | 2.0.1 | Stefan Milton Bache and Hadley Wickham (2020). magrittr: A Forward-Pipe Operator for R. R package version 2.0.1. https://CRAN.R-project.org/package=magrittr |
| MASS | 7.3.51.4 | Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0 |
| Matrix | 1.2.18 | Douglas Bates and Martin Maechler (2019). Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.2-18. https://CRAN.R-project.org/package=Matrix |
| R | 3.6.2 | R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. |
| report | 0.2.0 | Makowski, D., Ben-Shachar, M.S., Patil, I. & Lüdecke, D. (2020). Automated reporting as a practical tool to improve reproducibility and methodological best practices adoption. CRAN. Available from https://github.com/easystats/report. |
| tidyext | 0.3.6 | Michael Clark (2021). tidyext: Tidy Extensions for Data Processing. https://m-clark.github.io/tidyext, https://github.com/m-clark/tidyext. |
| tidyr | 1.0.2 | Hadley Wickham and Lionel Henry (2020). tidyr: Tidy Messy Data. R package version 1.0.2. https://CRAN.R-project.org/package=tidyr |